TIME INHOMOGENEOUS MARKOV CHAINS
WITH WAVE-LIKE BEHAVIOR 

By L. Saloff-Coste and J. Zuniga

Cornell University and Stanford University 

Starting from a given Markov kernel $K$ on a finite set $V$ and a
bijection $g$ of $V$, we construct and study a time inhomogeneous Markov
chain whose kernel at time $n$ is obtained from $K$ by transport of
$g^{n-1}$. We show that this construction leads to interesting examples,
and we obtain quantitative results for some of these examples.

1. Introduction. In [15, 17, 18], we considered the problem of obtaining
quantitative results describing the ergodic behavior of time inhomogeneous
finite Markov chains. In general, a time inhomogeneous Markov chain, say
on a finite set $V$, is described by a sequence of Markov kernels
$(K_i)_1^\infty$. At time $n$, the distribution of the chain started at $x$
is denoted by $K_{0,n}(x,\cdot)$. More generally, for $n \le m$, we define
$K_{n,m}$ inductively by $K_{n,n} = I$ (the identity matrix) and

$$K_{n,m}(x,y) = \sum_z K_{n,m-1}(x,z)K_m(z,y), \qquad x,y \in V.$$

If each $K_i$ is irreducible and aperiodic, one expects that, in many cases,
the Markov chain driven by this sequence will have the property that

$$\forall x,y \qquad \|K_{0,n}(x,\cdot) - K_{0,n}(y,\cdot)\|_{\mathrm{TV}} \to 0 \quad \text{as } n \to \infty.$$

We call this property total variation merging and say that the chain driven
by the sequence $(K_i)_1^\infty$ is merging. Note that, in general,
$K_{0,n}(x,\cdot)$ does not tend to a limiting distribution. However, when
merging occurs, the chain does forget where it started: asymptotically, the
distribution sequence evolves in time following a well-defined pattern
which is independent of the starting distribution.

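In this paper, we will mostly discuss a stronger notion which we call
relative-sup merging. By definition, the sequence is merging in
relative-sup if

$$\max_{x,y,z \in V}\left|\frac{K_{0,n}(x,z)}{K_{0,n}(y,z)} - 1\right| \to 0 \quad \text{as } n \to \infty.$$

In general, the relative-sup distance between two measures $\mu$ and $\nu$
(on a finite or countable state space) is defined by (note the asymmetry)

$$\max_{x \in V}\left|\frac{\mu(x)}{\nu(x)} - 1\right|.$$

In particular, for a time inhomogeneous chain driven by a sequence
$(K_i)_1^\infty$ of Markov kernels, we will consider quantities such as

$$\max_{x,z \in V}\left|\frac{K_{0,n}(x,z)}{\mu_n(z)} - 1\right|,$$

where $\mu_n = \mu_0 K_{0,n}$ for some starting measure $\mu_0$. For
$\varepsilon > 0$, we also define the $\varepsilon$ relative-sup merging
time $T_\infty(\varepsilon)$ by

$$T_\infty(\varepsilon) = \min\left\{n : \max_{x,y,z \in V}\left|\frac{K_{0,n}(x,z)}{K_{0,n}(y,z)} - 1\right| < \varepsilon\right\}.$$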

See [17] for more details. 
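
The two notions of merging defined above can be computed directly for small
chains. The following is a minimal sketch, not taken from the paper: the
helper names (`k0n`, `tv_merging`, `relsup_merging`) and the two-state
kernels `K1`, `K2` are illustrative assumptions, and the convention that the
relative-sup quantity is infinite as soon as an entry vanishes is a choice
made for this sketch.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def k0n(kernels, n):
    """K_{0,n} = K_1 K_2 ... K_n, cycling through `kernels` periodically."""
    size = len(kernels[0])
    out = [[F(int(i == j)) for j in range(size)] for i in range(size)]
    for step in range(n):
        out = matmul(out, kernels[step % len(kernels)])
    return out

def tv_merging(m):
    """max over x, y of the total variation distance between rows x and y."""
    return max(sum(abs(p - q) for p, q in zip(rx, ry)) / 2
               for rx in m for ry in m)

def relsup_merging(m):
    """max over x, y, z of |m(x,z)/m(y,z) - 1|.

    Returns infinity if some entry vanishes (a convention of this sketch).
    """
    if any(v == 0 for row in m for v in row):
        return float("inf")
    return max(abs(rx[z] / ry[z] - 1)
               for rx in m for ry in m for z in range(len(m)))

# Hypothetical two-state kernels with different stationary distributions.
K1 = [[F(1, 2), F(1, 2)], [F(1, 4), F(3, 4)]]
K2 = [[F(3, 4), F(1, 4)], [F(1, 2), F(1, 2)]]

for n in (1, 2, 3, 5):
    m = k0n([K1, K2], n)
    print(n, tv_merging(m), relsup_merging(m))
```

Exact rational arithmetic is used so that vanishing entries, which make the
relative-sup quantity infinite, are detected without rounding issues.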

Background and general results concerning time inhomogeneous Markov
chains are described in [10, 14, 19] where further references can be found.
It turns out that the study of merging is difficult, both at the qualitative
and the quantitative level, except in the special but interesting case when
all the kernels in the sequence $(K_i)_1^\infty$ share the same stationary
probability measure. See, for example, [3, 8, 13, 15]. Only a small set of
examples has been treated in the literature, mostly because proving anything
about concrete time inhomogeneous Markov chains is difficult.

This paper describes a special class of examples whose structure is, in
itself, quite interesting and for which some results can be obtained. The
setup is as follows. On a finite or countable set $V$, we are given a Markov
kernel $K$ and a bijection $g : V \to V$. We then consider the time
inhomogeneous Markov chain driven by the sequence of the kernels

$$K_i(x,y) = K(g^{i-1}x, g^{i-1}y), \qquad x,y \in V,\ i = 1,2,\dots.$$

The problem is to study this time inhomogeneous chain and its merging
properties. As we shall see, this covers some interesting examples and leads
to interesting results as well as difficult open problems.

The examples discussed in this paper can serve to illustrate the techniques
developed in [17, 18]. In particular, we will make use of the following basic
singular value technique. See [1] and Theorem 3.2 of [17].

Theorem 1.1. Given a sequence of Markov kernels $K_i$, $i = 1,2,\dots$, on
a set $V$ and a positive probability measure $\mu_0$, set
$\mu_n = \mu_0 K_{0,n}$ and let $\sigma_1(i)$ be the second largest singular
value of the operator $K_i : \ell^2(\mu_i) \to \ell^2(\mu_{i-1})$. Then, for
all $x,z \in V$, the quantity $\left|K_{0,n}(x,z)/\mu_n(z) - 1\right|$ is
bounded above [right-hand side of the display lost in extraction].

This good-looking result is deceptive because, unless one can get some
control on the sequence of measures $\mu_n$, it is essentially useless. Note
in particular that $\sigma_1(n)$ depends very much on $\mu_{n-1}$ and
$\mu_n$.

2. Stability. It is well established that the stationary distribution of an
irreducible aperiodic time homogeneous Markov chain plays a crucial part
in the analysis of the ergodic properties of the chain. Not much can be said
unless one can get some control on the stationary distribution. Moreover,
unless the chain is reversible or some algebraic miracle occurs, the
computation of the stationary measure is a difficult problem.

The situation for time inhomogeneous Markov chains is much worse. In
order to understand how the chain behaves when started from an arbitrary
distribution, it is crucial to find (at least) one initial distribution
$\mu_0$ such that the sequence of probability measures
$\mu_n = \mu_0 K_{0,n}$ is somewhat well behaved. The ideal situation is when
there is a $\pi$ such that $\pi K_{0,n} = \pi$ for all $n$. This occurs if
and only if all $K_i$ admit the same invariant measure $\pi$, a rather
fortunate but rare circumstance. The next definition, taken from [17],
introduces a property that is an obvious weakening of the existence of a
common invariant measure.

Definition 2.1. Fix $c \ge 1$. A sequence of Markov kernels
$(K_n)_1^\infty$ on a finite set $V$ is c-stable if there exists a measure
$\mu_0$ such that

(2.1) $$\forall n \ge 0,\ x \in V, \qquad c^{-1} \le \frac{\mu_n(x)}{\mu_0(x)} \le c,$$

where $\mu_n = \mu_0 K_{0,n}$. If this holds, we say that $(K_n)_1^\infty$ is
c-stable with respect to the measure $\mu_0$.


We refer the reader to [17, 18] for examples and results involving
c-stability. The idea behind this definition is that, if a sequence is
c-stable with respect to a probability measure $\mu_0$, then one can study
the merging of this sequence more or less as one would study the ergodicity
of a time homogeneous chain with invariant measure $\mu_0$. Why this is true
is not obvious and the required technical details are quite intricate.
Precise results in this direction are described in [17, 18]. We think that
c-stability is an interesting property in itself and that it deserves some
attention. Note also that, even for a fixed sequence $(K_i)_1^\infty$ on a
fixed finite state space, c-stability is a nontrivial property. The case of
the two point space is treated in [17].

A special case of interest to us here is when the time inhomogeneous
Markov chain is driven by a sequence $(K_i)_1^\infty$ that is periodic in
the sense that there is an integer $k$ such that

$$\forall i, \qquad K_{i+k} = K_i.$$

In such a case, there is an obvious candidate for a "good" starting
distribution $\mu_0$, namely, the invariant measure $\pi$ of
$K_1 \cdots K_k = K_{0,k}$. Indeed, if we pick $\mu_0 = \pi$ then the
sequence $\mu_n = \mu_0 K_{0,n}$ is also periodic of period $k$. If we can
compute $\pi$, this might allow us to investigate the properties of the
sequence $\mu_n$, including c-stability. Note however that in many examples
of interest, the period $k$ will grow with the size of the state space $V$
so that, even in that case, investigating c-stability in a meaningful way is
difficult.

An example of this type is cyclic-to-random transpositions. On $V = S_n$,
the symmetric group, let $Q_i$ be the Markov kernel $Q_i(x,y) = 1/n$ if
$y = x$ or if $y = x(i,j)$ for some $j \ne i$ and $Q_i(x,y) = 0$ otherwise.
Here $(i,j)$ stands for the corresponding transposition. This kernel
corresponds to "transpose the card in position $i$ with the card in a
uniformly chosen position." The cyclic-to-random transposition chain is
driven by the sequence of kernels $(K_i)_1^\infty$ with
$K_i = Q_{i \bmod n}$ (by definition, $Q_0 = Q_n$). See [8, 13, 15]. Of
course, in this example, the uniform measure is invariant for all $Q_i$.
Other examples of periodic time inhomogeneous chains are discussed in [3].
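
The cyclic-to-random transposition chain just described is easy to simulate.
The sketch below is only an illustration under stated assumptions: the
function name `cyclic_to_random` is hypothetical, the deck is stored as a
list whose entry at index $p$ is the card in position $p+1$, and the seed is
arbitrary.

```python
import random

def cyclic_to_random(n, steps, rng):
    """Run `steps` steps of cyclic-to-random transpositions on n cards.

    At time t the kernel K_t = Q_{t mod n} (with Q_0 = Q_n) swaps the card in
    position ((t - 1) mod n) + 1 with the card in a uniformly chosen position.
    """
    deck = list(range(1, n + 1))      # deck[p] is the card in position p + 1
    for t in range(1, steps + 1):
        i = (t - 1) % n               # 0-based index of the cycling position
        j = rng.randrange(n)          # uniform position; j == i leaves the deck fixed
        deck[i], deck[j] = deck[j], deck[i]
    return deck

rng = random.Random(0)                # arbitrary seed, for reproducibility only
print(cyclic_to_random(5, 10, rng))
```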

3. Periodic waves. We now describe in detail the construction outlined
in the introduction. This construction is of a rather general nature and
produces periodic time inhomogeneous Markov chains that reduce, in a sense,
to time homogeneous chains.

Let $K$ be a Markov kernel on a finite state space $V$, and let
$g : V \to V$, $x \mapsto g(x) = gx$ be a bijection. The order of the map
$g$ is

$$k = \min\{n \in \mathbb{N} : \forall x \in V,\ g^n x = x\}, \qquad g^n = g \circ g \circ \cdots \circ g.$$

For all $x,y \in V$, set

(3.1) $$K_i(x,y) = K(g^{i-1}x, g^{i-1}y)$$

so that $K = K_1$. Consider the inhomogeneous Markov chain driven by the
sequence $(K_i)_1^\infty$ defined above. It is easy to see that all $K_i$
are irreducible aperiodic kernels if and only if $K$ is. Moreover, if $K$
has stationary distribution $\pi$ then $K_i$ has stationary distribution
$\pi_i$ where $\pi_i(x) = \pi(g^{i-1}x)$. Obviously, the sequence
$(K_i)_1^\infty$ is periodic of period $k$. Examples are discussed below
after we discuss some general properties of these chains. Given this
definition, the obvious question we face is the following: How are the
(quantitative) merging properties of the chain driven by $(K_i)_1^\infty$
related to the (quantitative) ergodic properties of the chain driven by $K$?

Proposition 3.1. Set

(3.2) $$\widetilde{K}(x,y) = K(x, g^{-1}y),$$

where $g^{-1} : V \to V$ is the inverse of the map $g$. Then $K_{0,n}$ is
given by

$$K_{0,n}(x,y) = \widetilde{K}^n(x, g^n y).$$

Proof. We proceed by induction. For $n = 1$ the result holds by
definition. Assume that $\widetilde{K}^n(x,y) = K_{0,n}(x, g^{-n}y)$. Then
we have

$$\widetilde{K}^{n+1}(x,y) = \sum_z \widetilde{K}^n(x,z)\widetilde{K}(z,y)
= \sum_z K_{0,n}(x, g^{-n}z)K_{n+1}(g^{-n}z, g^{-n-1}y)
= K_{0,n+1}(x, g^{-(n+1)}y).$$

This gives the desired result. □
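
The identity of Proposition 3.1 can be checked numerically on a small
example. The sketch below is an illustration only: the three-state kernel
`K` and the 3-cycle `g` are arbitrary choices that do not appear in the
paper, and exact rational arithmetic is used so that equality can be tested
exactly.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def iterate(perm, n, x):
    """Apply the map `perm` to x, n times."""
    for _ in range(n):
        x = perm[x]
    return x

V = range(3)
g = [1, 2, 0]                        # hypothetical bijection x -> g[x], of order 3
g_inv = [g.index(y) for y in V]      # g_inv[y] is the inverse image of y
K = [[F(1, 2), F(1, 2), F(0)],       # hypothetical Markov kernel
     [F(1, 3), F(1, 3), F(1, 3)],
     [F(0), F(1, 4), F(3, 4)]]

def K_i(i):
    """K_i(x, y) = K(g^{i-1} x, g^{i-1} y), as in (3.1)."""
    return [[K[iterate(g, i - 1, x)][iterate(g, i - 1, y)] for y in V]
            for x in V]

# Twisted kernel of (3.2): K_tilde(x, y) = K(x, g^{-1} y).
K_tilde = [[K[x][g_inv[y]] for y in V] for x in V]

def check(n):
    """Test K_{0,n}(x, y) == K_tilde^n(x, g^n y) for all x, y."""
    left = [[F(int(x == y)) for y in V] for x in V]
    right = [[F(int(x == y)) for y in V] for x in V]
    for i in range(1, n + 1):
        left = matmul(left, K_i(i))
        right = matmul(right, K_tilde)
    return all(left[x][y] == right[x][iterate(g, n, y)] for x in V for y in V)

print(all(check(n) for n in range(1, 8)))
```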



Corollary 3.2. The kernel $\widetilde{K}$ is irreducible aperiodic if and
only if there exists an integer $n_0 > 0$ such that for all $x,y \in V$,
$K_{0,n_0}(x,y) > 0$.

The following examples illustrate some of the subtleties of this
construction.



Fig. 1. Graph structure for kernels $K = K_1$ and $K_2$.

Fig. 2. Graph structure for $\widetilde{K}$.

Example 3.1. Let $K$ be irreducible, periodic of period $k$, with
periodicity classes $C_0, \dots, C_{k-1}$ so that $K(x,y) > 0$ if and only
if $x \in C_i$ and $y \in C_{i+1 \bmod k}$. Assume that
$|C_0| = \cdots = |C_{k-1}|$, that is, all the periodicity classes have the
same cardinality. Let $g : V \to V$ be a bijection such that
$g(C_i) = C_{i-1 \bmod k}$. Let $K_i(x,y) = K(g^{i-1}x, g^{i-1}y)$,
$\widetilde{K}(x,y) = K(x, g^{-1}y)$ as above. It is clear that
$\widetilde{K}(x,y) > 0$ if and only if $x, y$ are in the same class $C_i$
for some $i$. That is, $\widetilde{K}$ is not irreducible. On the other
hand, for any $x,y$ there exists $n = n(x,y)$ such that $K_{0,n}(x,y) > 0$.



Example 3.2. On $V = \{1,2,3,4\}$, consider the irreducible aperiodic
reversible kernel $K$ given by
$K(1,1) = K(1,2) = K(2,1) = K(2,3) = K(3,2) = K(3,4) = 1/2$, $K(4,3) = 1$
and $K(x,y) = 0$ otherwise. Let $g$ be the map that transposes 3 and 4.
Then $K_2(1,1) = K_2(1,2) = K_2(2,1) = K_2(2,4) = K_2(4,2) = K_2(4,3) = 1/2$,
$K_2(3,4) = 1$ and $K_2(x,y) = 0$ otherwise. The graph structure for kernels
$K$ and $K_2$ is illustrated in Figure 1. It follows that

$$K_{0,2n}(4,4) = 1, \qquad K_{0,2n+1}(4,3) = 1.$$

This shows that the property that $K$ is irreducible aperiodic does not
imply that for each $x,y$ there is an $n = n(x,y)$ such that
$K_{0,n}(x,y) > 0$. Further,
$\widetilde{K}(1,1) = \widetilde{K}(1,2) = \widetilde{K}(2,1) = \widetilde{K}(2,4) = \widetilde{K}(3,2) = \widetilde{K}(3,3) = 1/2$,
$\widetilde{K}(4,4) = 1$. Hence, $\widetilde{K}$ is not irreducible and has
a unique absorbing state, namely, the point 4, as illustrated by Figure 2.

This implies that the sequence $K_1, K_2, K_1, K_2, \dots$ is merging in
total variation, that is, $K_{0,n}(x,z) - K_{0,n}(y,z) \to 0$ for any
$x,y,z$. Note that for $z \ne 4$, we have $K_{0,2n}(x,z) \to 0$ for any $x$.
However, this same sequence is not merging in relative-sup distance. Indeed,

$$T_\infty(\varepsilon) = \min\left\{n : \max_{x,y,z}\left|\frac{K_{0,n}(x,z)}{K_{0,n}(y,z)} - 1\right| < \varepsilon\right\} = \infty$$

since $K_{0,2n}(4,1) = 0$ and $K_{0,2n}(1,1) > 0$.

This gives an example of a pair $K_1, K_2$ of reversible, irreducible and
aperiodic Markov kernels such that the sequence $K_1, K_2, K_1, K_2, \dots$
is not merging in relative-sup distance.
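
The claims of Example 3.2 can be reproduced with a few lines of exact
arithmetic. This is a sketch for illustration: states 1 to 4 are stored at
indices 0 to 3, and the helper names `k0n` and `tv_merging` are assumptions
of the sketch, not notation from the paper.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

h = F(1, 2)
K1 = [[h, h, 0, 0],      # kernel K of Example 3.2 (state s is index s - 1)
      [h, 0, h, 0],
      [0, h, 0, h],
      [0, 0, 1, 0]]
g = [0, 1, 3, 2]         # g transposes states 3 and 4
K2 = [[K1[g[x]][g[y]] for y in range(4)] for x in range(4)]

def k0n(n):
    """K_{0,n} = K_1 K_2 K_1 K_2 ... (n factors)."""
    out = [[F(int(x == y)) for y in range(4)] for x in range(4)]
    for step in range(n):
        out = matmul(out, K1 if step % 2 == 0 else K2)
    return out

def tv_merging(m):
    """max over x, y of the total variation distance between rows x and y."""
    return max(sum(abs(p - q) for p, q in zip(rx, ry)) / 2
               for rx in m for ry in m)

even, odd = k0n(40), k0n(41)
print(even[3][3], odd[3][2])        # K_{0,2n}(4,4) and K_{0,2n+1}(4,3)
print(even[3][0], even[0][0] > 0)   # a vanishing entry next to a positive one
print(float(tv_merging(even)))      # small: total variation merging
```

The vanishing entry $K_{0,2n}(4,1)$ sitting in the same column as the
positive entry $K_{0,2n}(1,1)$ is what makes the relative-sup quantity
infinite even though the total variation distance between rows is small.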






Example 3.3. On the symmetric group $S_n$, set $\sigma$ and $\sigma'$ to be
the cycles $\sigma = (n, n-1, \dots, 1)$ and
$\sigma' = (n-1, n-2, \dots, 1)$ and $\rho$ to be the permutation defined by
$\rho(i) = n - i + 1$. In terms of a deck of $n$ cards, $\sigma$ takes the
top card to the bottom, $\sigma'$ takes the top card to the second to last
position whereas $\rho$ reverses the order of the deck. Consider the kernel
$K(x,y) = 1/2$ if $x^{-1}y \in \{\sigma, \sigma'\}$ and 0 otherwise, and the
bijection $g(x) = \rho x \rho^{-1}$, which is of order 2. Observe that $K$
is irreducible and aperiodic. Note that $g(\sigma) = \sigma^{-1}$ (take the
bottom card and put it on top) and $g(\sigma') = (2, 3, \dots, n)$ (take the
bottom card and put it in second position). From this it follows that

$$K_{0,2}(x,y) = \sum_z K(x,z)K(g(z), g(y)) =
\begin{cases}
1/4, & \text{if } x^{-1}y \in \{e, (1,2), (1,n), (1,n,2)\},\\
0, & \text{otherwise.}
\end{cases}$$

This shows that, for all $n$, $K_{0,2n}(e,x) = 0$ unless
$x \in B = \{e, (1,2), (1,n), (1,n,2)\}$, and $K_{0,2n+1}(e,x) = 0$ unless
$x \in \sigma B \cup \sigma' B$. We note that describing $\widetilde{K}$ is
difficult.

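Proposition 3.3. Let $\widetilde{\pi}$ be an invariant measure for
$\widetilde{K}$. Set

$$\forall x \in V,\ i = 0, 1, 2, \dots \qquad \mu_i(x) = \widetilde{\pi}(g^i x).$$

Then $\mu_{i-1}K_i = \mu_i$.

Proof. Indeed, we have

$$\mu_{i-1}K_i(x) = \sum_{z \in V}\mu_{i-1}(z)K_i(z,x)
= \sum_{z \in V}\widetilde{\pi}(g^{i-1}z)K(g^{i-1}z, g^{i-1}x)
= \sum_{z \in V}\widetilde{\pi}(g^{i-1}z)\widetilde{K}(g^{i-1}z, g^i x)
= \widetilde{\pi}(g^i x) = \mu_i(x).$$ □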

The "wave" appearing in the title of this paper corresponds to the
distribution $\widetilde{\pi}$. The time inhomogeneous chain driven by the
sequence $(K_i)_1^\infty$ produces the wave $\widetilde{\pi}$, moving around
in a periodic fashion under the action of the bijection $g$ on the set $V$.
Despite the similarity in names, we do not claim any connection of this
paper with the subject of traveling waves.

Corollary 3.4. Assume that $\widetilde{K}$ admits a positive invariant
measure $\widetilde{\pi}$. Then the sequence $(K_n)_1^\infty$ is c-stable
with respect to the measure $\mu_0 = \widetilde{\pi}$ with

$$c = \max_{x,i}\{\widetilde{\pi}(g^i x)/\widetilde{\pi}(x)\}.$$

The next proposition discusses the singular value decompositions of various
operators appearing in this construction. The proof is by inspection.
We use the following notation. We assume that $\widetilde{\pi}$ is an
invariant measure for $\widetilde{K}$ and that $\widetilde{\pi}(x) > 0$ for
all $x \in V$. Let $\sigma_j$, $j = 0, \dots, |V| - 1$, be the singular
values of
$\widetilde{K} : \ell^2(\widetilde{\pi}) \to \ell^2(\widetilde{\pi})$ in
nonincreasing order, and let $(\phi_j)_0^{|V|-1}$, $(\psi_j)_0^{|V|-1}$ be
orthonormal bases of $\ell^2(\widetilde{\pi})$ such that
$\widetilde{K}\phi_j = \sigma_j\psi_j$ (with $\sigma_0 = 1$,
$\phi_0 = \psi_0 \equiv 1$). We refer the reader to [17] for a detailed
discussion. The orthonormal bases $(\phi_j)_0^{|V|-1}$,
$(\psi_j)_0^{|V|-1}$ are, respectively, eigenbases for
$\widetilde{K}^*\widetilde{K}$ and $\widetilde{K}\widetilde{K}^*$.

Proposition 3.5. For any $i \in \{1, 2, \dots\}$,
$\phi_j^i(x) = \phi_j(g^i x)$, $j = 0, \dots, |V| - 1$, and
$\psi_j^i(x) = \psi_j(g^{i-1}x)$, $j = 0, \dots, |V| - 1$, are orthonormal
bases of $\ell^2(\mu_i)$ and $\ell^2(\mu_{i-1})$, respectively, which
provide a singular value decomposition of
$K_i : \ell^2(\mu_i) \to \ell^2(\mu_{i-1})$ in the sense that
$K_i\phi_j^i = \sigma_j\psi_j^i$. In particular, the singular values
$\sigma_j(K_i, \mu_{i-1})$ of $K_i : \ell^2(\mu_i) \to \ell^2(\mu_{i-1})$
are given by $\sigma_j(K_i, \mu_{i-1}) = \sigma_j$,
$j = 0, \dots, |V| - 1$.

If $\alpha$ is an eigenvalue of $\widetilde{K}$ with eigenfunction $\omega$
and $k$ is the order of $g$ then $\alpha^k$ is an eigenvalue of
$K_1 \cdots K_k$ with the same eigenfunction.
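
The "inspection" behind Proposition 3.5 can be spelled out as follows. This
is a sketch of the computation using only the definitions (3.1) and (3.2)
and the notation $\widetilde{K}$, $\widetilde{\pi}$, $\phi_j$, $\psi_j$ for
the twisted kernel, its invariant measure and its singular vectors; it is
not quoted from the paper.

```latex
% K_i maps phi_j^i to sigma_j psi_j^i: substitute w = g^i y in the sum.
\begin{align*}
K_i\phi_j^i(x)
  &= \sum_{y \in V} K(g^{i-1}x, g^{i-1}y)\,\phi_j(g^i y)
   = \sum_{y \in V} \widetilde{K}(g^{i-1}x, g^{i}y)\,\phi_j(g^i y) \\
  &= (\widetilde{K}\phi_j)(g^{i-1}x)
   = \sigma_j\,\psi_j(g^{i-1}x)
   = \sigma_j\,\psi_j^i(x).
\end{align*}
% Orthonormality in l^2(mu_i) follows from mu_i(x) = \tilde{pi}(g^i x):
\begin{equation*}
\sum_{x \in V}\phi_j^i(x)\,\phi_l^i(x)\,\mu_i(x)
  = \sum_{x \in V}\phi_j(g^i x)\,\phi_l(g^i x)\,\widetilde{\pi}(g^i x)
  = \delta_{jl}.
\end{equation*}
% For the last statement, Proposition 3.1 with n = k and g^k = id gives
% K_1 \cdots K_k = K_{0,k} = \widetilde{K}^k.
```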

This proposition illustrates clearly the difficulties that appear in
relating the ergodic properties of the kernel $K$ (that serves as the basic
ingredient of this construction) to the merging properties of the sequence
$(K_i)_1^\infty$. Indeed, it is rather unclear how the ergodic properties of
$K$ and the properties of its stationary measure $\pi$ relate to
$(\widetilde{K}, \widetilde{\pi})$.

In the following two examples, $\pi = \widetilde{\pi}$ is the uniform
measure on $V$. Even in these cases, the above construction is quite
interesting and nontrivial. Examples with $\pi \ne \widetilde{\pi}$ will be
discussed in the next two sections.

Example 3.4 (Cycling for binary vectors). In this example, the kernel
$K$ is not irreducible. Take $V = \{0,1\}^N$ with $\widetilde{\pi}$ being
the uniform distribution on $V$. Let $e_i$ be the binary vector with a
unique 1 in position $i$. Let $K(x,y) = 0$ except if $y = x$ or
$y = x + e_1$ in which case $K(x,y) = 1/2$ ($K$ randomizes the first binary
entry of $x$). Let $gx = (x_2, \dots, x_N, x_1)$ if
$x = (x_1, \dots, x_N)$ (shift to the left). Using the definition, one
checks that $K_i$ is the Markov kernel that randomizes the $i$th coordinate.
Hence, $K_1 \cdots K_N = \widetilde{\pi}$ (after $N$ steps, we have a binary
vector picked uniformly at random).

The kernel $\widetilde{K}$ corresponds to randomizing the first entry and
shifting left. Its invariant measure $\widetilde{\pi}$ is uniform. One
recovers immediately the fact that the uniform distribution is reached after
exactly $N$ steps. The singular values (= eigenvalues) of $K$ (which is
reversible) are 1 (multiplicity $2^{N-1}$) and 0 (multiplicity $2^{N-1}$).
The kernel $\widetilde{K}$ has the property that
$\widetilde{K}\widetilde{K}^* = K^2 = K$ so that it has the same singular
values. The operator $\widetilde{K}$ has two eigenvalues, 0 and 1, and is
not diagonalizable, but $\widetilde{K} - \widetilde{\pi}$ is nilpotent since
$(\widetilde{K} - \widetilde{\pi})^N = 0$.
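
For a small value of $N$ the statements of Example 3.4 can be verified by
brute force. The following sketch takes $N = 3$ (an arbitrary illustrative
choice) and checks that $K_1 \cdots K_N$ is the uniform kernel and that the
$N$th power of the twisted kernel minus the uniform kernel vanishes while
the $(N-1)$th power does not; the helper names are assumptions of the
sketch.

```python
from fractions import Fraction as F
from itertools import product

N = 3
V = list(product((0, 1), repeat=N))       # all binary vectors of length N

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(a, n):
    """a^n for n >= 1."""
    out = a
    for _ in range(n - 1):
        out = matmul(out, a)
    return out

def shift(x, n=1):
    """g^n: shift the binary vector x to the left n times."""
    n %= len(x)
    return x[n:] + x[:n]

def K(x, y):
    """Randomize the first entry: y must agree with x on entries 2..N."""
    return F(1, 2) if x[1:] == y[1:] else F(0)

def K_i(i):
    """K_i(x, y) = K(g^{i-1} x, g^{i-1} y): randomizes the i-th entry."""
    return [[K(shift(x, i - 1), shift(y, i - 1)) for y in V] for x in V]

# Twisted kernel K_tilde(x, y) = K(x, g^{-1} y), with g^{-1} = g^{N-1}.
K_tilde = [[K(x, shift(y, N - 1)) for y in V] for x in V]
uniform = [[F(1, 2 ** N)] * len(V) for _ in V]

prod = K_i(1)
for i in range(2, N + 1):
    prod = matmul(prod, K_i(i))

D = [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(K_tilde, uniform)]
print(prod == uniform)
print(all(v == 0 for row in mat_power(D, N) for v in row))
print(any(v != 0 for row in mat_power(D, N - 1) for v in row))
```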

Example 3.5 (Cyclic-to-random transposition). See, for example, [13,
15]. On the symmetric group $S_n$, let $K(x,y) = 1/n$ if $y = x(1,j)$,
$j = 1, 2, \dots, n$, and $K(x,y) = 0$ otherwise (this is called "transpose
top with random"). Let $\sigma$ be the cycle $(1, 2, \dots, n)$ and
$g : S_n \to S_n$, $x \mapsto g(x) = \sigma x \sigma^{-1}$. Observe that
$g^i((1,j)) = (1+i, j+i) \pmod n$ so that $K_i$ is "transpose $i$ with
random." Hence, we recover the cyclic-to-random transposition chain.

Because $\widetilde{\pi} = \pi$ in this case, it follows that the singular
values of $\widetilde{K}$ are equal to the singular values of $K$ which can
be computed by using the representation theory of $S_n$. Note that, as $K$
is reversible, the singular values of $K$ are the square roots of the
squares of its eigenvalues, that is, the absolute values of the eigenvalues.
In particular, $\sigma_1 = 1 - 1/n$ and thus
$\sigma_1(K_i, \pi) = \sigma_1 = 1 - 1/n$ for all $i$ (see [2, 7, 15, 16]).
The eigenvalues of $\widetilde{K}$ are rather mysterious, and it is not
clear that $\widetilde{K}$ is diagonalizable. See [13] where the eigenvalues
of $K_1 \cdots K_n$ (hence, indirectly, the eigenvalues of $\widetilde{K}$)
are investigated and used to obtain a very interesting lower bound on the
mixing time of cyclic-to-random transposition.

Propositions 3.1 and 3.5 reduce the study of the merging of the sequence
$(K_i)_1^\infty$ to the study of the ergodicity of the time homogeneous
Markov chain driven by $\widetilde{K}$. More precisely, we have the
following result.

Theorem 3.6. Fix $V, K, g, \widetilde{K}$ and $(K_i)_1^\infty$ as above.

(1) The sequence $(K_i)_1^\infty$ is merging in relative-sup if and only if
the kernel $\widetilde{K}$ is irreducible and aperiodic.

(2) If $\widetilde{K}$ is irreducible and aperiodic, let $\widetilde{\pi}$
be its unique invariant probability measure and set
$\mu_i(x) = \widetilde{\pi}(g^i x)$, $x \in V$. Then

[displayed inequality lost in extraction]

where $\sigma_1$ is the second largest singular value of $\widetilde{K}$
acting on $\ell^2(\widetilde{\pi})$.

Proof. Use Propositions 3.1 and 3.5. To obtain the last inequality, use
Theorem 1.1. Theorem 3.2 of [17] also yields an additional inequality for
the chi-square distance between $K_{0,n}(x,\cdot)$ and $\mu_n$. □

Remark 3.7. Example 3.2 gives an example where total variation merging
occurs, but $\widetilde{K}$ is not irreducible.

Proposition 3.8. Assume that $K$ is irreducible and

$$\min_{x \in V}\{K(x,x)\} > 0.$$

Then, for any bijection $g$ of $V$, $\widetilde{K}$ is irreducible and
aperiodic, and $(K_i)_1^\infty$ is merging in relative-sup.

Proof. By Example 3.6 of [17] we have $K_{0,|V|}(x,y) > 0$ for all
$x,y \in V$. By Corollary 3.2, this implies that $\widetilde{K}$ is
irreducible aperiodic. By Theorem 3.6(1), we conclude that
$(K_i)_1^\infty$ is merging. □

The proof of the proposition above illustrates the surprising fact that it
is not always advantageous to study $\widetilde{K}$ instead of the sequence
$(K_i)_1^\infty$. In Proposition 3.8, we use the sequence $(K_i)$ to study
$\widetilde{K}$! Indeed, the chain $\widetilde{K}$ seems often difficult to
study. For one thing, $\widetilde{K}$ is not necessarily reversible even if
$K$ is. In general, this means that computing $\widetilde{\pi}$ may be
difficult. Even when we can compute $\widetilde{\pi}$, it might be difficult
to study the ergodicity of $\widetilde{K}$ from its definition. Consider,
for instance, the case of cyclic-to-random transposition. In this case,
$\widetilde{\pi}$ is the uniform distribution, but $\widetilde{K}$ is not
invariant under the action of $S_n$. In other words, the chain driven by
$\widetilde{K}$ is not a random walk on $S_n$. This makes studying
$\widetilde{K}$ and its powers directly rather difficult (and, indeed,
mysterious). The results obtained in [8, 13, 15] concerning the
cyclic-to-random transposition chain are essentially obtained by considering
the sequence $(K_i)_1^\infty$, not $\widetilde{K}$ (which, for one thing,
does not appear in those papers).

4. Perturbations of symmetric kernels. Let $Q$ be a symmetric Markov
kernel on a finite set $V$, that is, $Q(x,y) = Q(y,x)$ for all $x,y \in V$.
This kernel has the uniform distribution $u \equiv 1/|V|$ as its reversible
measure. Fix an $\varepsilon \in (0,1)$ and a set $A \subset V$, and
consider the kernel

(4.1) $$K = Q + \Delta_A,$$

where $\Delta_A$ is some perturbation kernel such that for all $x,y \in V$:

(a) $\sum_z \Delta_A(x,z) = 0$,

(b) $\Delta_A(x,y) \ge -\varepsilon Q(x,y)$ and

(c) $x \notin A \Rightarrow \Delta_A(x,y) = 0$.

Let $g$ be a permutation of the vertex set $V$ and consider the sequence
$(K_i)_1^\infty$ defined by $K_i(x,y) = K(g^{i-1}x, g^{i-1}y)$. Set
$\widetilde{K}(x,y) = K(x, g^{-1}y)$, as before. Let $\widetilde{\pi}$ be an
invariant probability measure for $\widetilde{K}$ and set

$$\mu_i(x) = \widetilde{\pi}(g^i x), \qquad x \in V,\ i = 0, 1, 2, \dots.$$

Define also the symmetric kernel

$$Q_g(x,y) = Q(g^{-1}x, g^{-1}y).$$

Consider the following two assumptions on the kernel $\widetilde{K}$:

(A1) (Irreducibility of $\widetilde{K}$) For all $x,y \in V$ there exists an
$n = n(x,y)$ such that $\widetilde{K}^n(x,y) > 0$.

(A2) (Aperiodicity of $\widetilde{K}$) There exists a number $N$ such that,
for all $m \ge N$ and all $x \in V$, $\widetilde{K}^m(x,x) > 0$.

Recall (see Theorem 3.6) that these properties are necessary for the
relative-sup merging of the sequence $(K_i)_1^\infty$. In general, it is not
obvious at all how they can be checked. However, if the permutation $g$ is
an automorphism of the graph structure on $V$ with edge set
$E = \{(x,y) : K(x,y) > 0\}$, then these properties reduce to the similar
properties for $K$ (see Proposition 3.1).

The most useful technical result concerning such time inhomogeneous
perturbations of $Q$ is the following comparison lemma. For more on
comparison techniques see [4].

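Lemma 4.1. Referring to the above setting, assume that

(4.2) $$\exists c > 0, \qquad \max_{x \in V}\{\widetilde{\pi}(x)\} \le c\min_{x \in V}\{\widetilde{\pi}(x)\}.$$

Consider the operators $Q_g$, $\widetilde{K}$ acting respectively on
$\ell^2(u)$, $\ell^2(\widetilde{\pi})$. Then the Dirichlet forms
$\mathcal{E}_{Q_g^*Q_g,u}$ of $Q_g^*Q_g$ on $\ell^2(u)$ and
$\mathcal{E}_{\widetilde{K}^*\widetilde{K},\widetilde{\pi}}$ of
$\widetilde{K}^*\widetilde{K}$ on $\ell^2(\widetilde{\pi})$ satisfy

(4.3) $$\mathcal{E}_{Q_g^*Q_g,u}(f,f) \le \frac{c}{(1-\varepsilon)^2}\,\mathcal{E}_{\widetilde{K}^*\widetilde{K},\widetilde{\pi}}(f,f)$$

for any function $f$ defined on $V$.

Proof. Working on $\ell^2(\widetilde{\pi})$ and $\ell^2(u)$, respectively,
we compare the kernel $\widetilde{K}^*\widetilde{K}$ to the kernel
$Q_g^*Q_g$, that is, $Q^*Q$ moved by $g^{-1}$. Write

$$\begin{aligned}
\widetilde{\pi}(x)\widetilde{K}^*\widetilde{K}(x,y)
&\ge \frac{1}{c}\sum_z u(z)K(z, g^{-1}x)K(z, g^{-1}y)\\
&\ge \frac{(1-\varepsilon)^2}{c}\sum_z u(z)Q(z, g^{-1}x)Q(z, g^{-1}y)\\
&= \frac{(1-\varepsilon)^2}{c}\sum_z u(g^{-1}z)Q(g^{-1}z, g^{-1}x)Q(g^{-1}z, g^{-1}y)\\
&= \frac{(1-\varepsilon)^2}{c}\,u(x)Q_g^*Q_g(x,y).
\end{aligned}$$

The third line uses the fact that for any $z$,
$u(g^{-1}z) = u(z) = 1/|V|$. □

The importance of this lemma comes from the fact that $Q_g$ is simply $Q$
transported by $g^{-1}$ and thus has the same properties as $Q$. For
instance, $Q_g$ has the same eigenvalues and singular values as $Q$ (the
eigenvectors of $Q_g$ are the eigenvectors of $Q$ transported by $g^{-1}$,
etc.). Similarly, $Q_g$ satisfies the same Nash and logarithmic Sobolev
inequalities on $\ell^2(u)$ as $Q$ itself. By Lemma 4.1, these properties
will be transferred to $(\widetilde{K}, \widetilde{\pi})$. The following two
propositions and assorted remarks are based on this observation.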



Proposition 4.2. Referring to the above setting, assume that (4.2)
holds, that is,

$$\max_{x \in V}\{\widetilde{\pi}(x)\} \le c\min_{x \in V}\{\widetilde{\pi}(x)\}.$$

Let $\sigma_1$ be the second largest singular value of $Q$ on $\ell^2(u)$.
Then the second largest singular value $\widetilde{\sigma}_1$ of
$\widetilde{K}$ on $\ell^2(\widetilde{\pi})$ is bounded by

$$\widetilde{\sigma}_1^2 \le 1 - \frac{(1-\varepsilon)^2}{c^2}(1 - \sigma_1^2).$$

Furthermore by Theorem 3.6 we obtain

$$\max_{x,z \in V}\left|\frac{K_{0,n}(x,z)}{\mu_n(z)} - 1\right| \le c|V|\left(1 - \frac{(1-\varepsilon)^2}{c^2}(1 - \sigma_1^2)\right)^{n/2}.$$

Remark 4.3. If instead of using $\sigma_1$ we use the logarithmic Sobolev
constant $l(Q^*Q)$ of $Q^*Q$ (see [6, 18] for the definition; we follow the
notation of [18]), then we get

$$l(\widetilde{K}^*\widetilde{K}) \ge \frac{(1-\varepsilon)^2}{c^2}\,l(Q^*Q).$$

In cases where a good estimate on $l(Q^*Q)$ is known, this can potentially
improve upon the merging bound stated above. See [6, 18].



In the next proposition, we make use of one of the main results of [5, 18]
which concerns the use of the Nash inequalities. In applications, the
constants $c$, $c_1$, $C_1$, $D$ appearing in the statement below take fixed
values whereas the parameter $T$ grows with the size of the underlying state
space. It is, in general, equal to the square of the diameter of the state
space $V$ equipped with the graph structure induced by the symmetric kernel
$Q$. For an introduction to the use of Nash inequalities in the study of
ergodic Markov chains, see [5].



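Proposition 4.4. Referring to the above setting, assume that there are
constants $c, c_1, C_1, D \in (0, \infty)$ and a parameter $T \ge 1$ such
that:

Condition (4.2) holds, that is,

$$\max_{x \in V}\{\widetilde{\pi}(x)\} \le c\min_{x \in V}\{\widetilde{\pi}(x)\}.$$

The second largest singular value $\sigma_1(Q)$ of $Q$ on $\ell^2(u)$
satisfies

$$\sigma_1(Q) \le 1 - \frac{c_1}{T}.$$

The kernel $Q$ satisfies the Nash inequality (all norms are w.r.t. $u$)

$$\forall f : V \to \mathbb{R}, \qquad \|f\|_2^{2+1/D} \le C_1T\left(\mathcal{E}_{Q^*Q}(f,f) + \frac{1}{T}\|f\|_2^2\right)\|f\|_1^{1/D}.$$

Then, for any $n \ge 2T$ and $x,z \in V$, the quantity
$\left|K_{0,n}(x,z)/\mu_n(z) - 1\right|$ is bounded by an explicit
expression in $D$, $C_1$, $c_1$, $c$, $\varepsilon$, $n$ and $T$ [the
right-hand side of the display is garbled in the source].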


Proof. Let $u \equiv 1/|V|$. For any function $f : V \to \mathbb{R}$ we
have
$\mathcal{E}_{Q^*Q,u}(f,f) = \mathcal{E}_{Q_g^*Q_g,u}(f \circ g^{-1}, f \circ g^{-1})$
and $\|f\|_p = \|f \circ g^{-1}\|_p$ for $p = 1, 2$. Thus
$(\mathcal{E}_{Q_g^*Q_g,u})$ satisfies the same Nash inequality as
$(\mathcal{E}_{Q^*Q,u})$. By Lemma 4.1 and (4.2), this yields a Nash
inequality of the same form, with norms taken in $\ell^p(\widetilde{\pi})$
and modified constants [display garbled in the source], for
$(\mathcal{E}_{\widetilde{K}^*\widetilde{K},\widetilde{\pi}})$. The desired
result now follows by applying Propositions 3.1, 4.2 and the results of [5].
(See also Theorem 2.5 of [18].) □

Observe that the conclusion can be rephrased by saying that, under the
hypotheses made, the time inhomogeneous chain driven by $(K_i)_1^\infty$ has
a relative-sup merging time at most of order $T$. This will be illustrated
below in concrete examples.

Assuming (as is natural) that we understand well the finite Markov chain
driven by the symmetric kernel $Q$, the main difficulty that remains in
studying the time inhomogeneous chain $(K_i)_1^\infty$ considered in this
section is to verify the condition (4.2) for some (explicit) constant $c$.
The following lemma is useful in this regard.

Lemma 4.5. Assume that $\widetilde{\pi} \ne u$ and that $\widetilde{K}$
satisfies the irreducibility condition (A1) above. Let
$M = \max_x\{\widetilde{\pi}(x)\}$ and $m = \min_x\{\widetilde{\pi}(x)\}$.
Let

$$A_+^* = \Big\{x \in V : \sum_y \widetilde{K}(y,x) > 1\Big\}, \qquad A_-^* = \Big\{x \in V : \sum_y \widetilde{K}(y,x) < 1\Big\}.$$

Then there are points $x_+ \in A_+^*$, $x_- \in A_-^*$ such that
$\widetilde{\pi}(x_+) = M$, $\widetilde{\pi}(x_-) = m$.

Proof. Let $B = \{z : \sum_y \widetilde{K}(y,z) = 1\}$. Let $x \in V$ be a
point such that $\widetilde{\pi}(x) = M$. Then we must have
$\sum_y \widetilde{K}(y,x) \ge 1$. If $\sum_y \widetilde{K}(y,x) > 1$, we
are done. Otherwise, $x \in B$ and we must have $\widetilde{\pi}(y) = M$ for
all $y$ such that $\widetilde{K}(y,x) > 0$. Either one of these points $y$
satisfies $\sum_z \widetilde{K}(z,y) > 1$ and we are done, or we repeat the
argument. Since $\widetilde{K}$ satisfies (A1) and
$\widetilde{\pi} \ne u$, this process necessarily yields a point $x_+$ such
that $\widetilde{\pi}(x_+) = M$ and $x_+ \notin B$. Of course, we must then
have $x_+ \in A_+^*$. The same line of reasoning proves the existence of the
desired point $x_- \in A_-^*$. □

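Remark 4.6. Note that $A_+^*$, $A_-^*$ are contained in the
"$\widetilde{K}$-boundary" of $A$, that is in the set
$A^* = \{z : \exists y \in A,\ \widetilde{K}(y,z) > 0\}$. Indeed, if
$x \notin A^*$ then

$$\sum_y \widetilde{K}(y,x) = \sum_y Q(y, g^{-1}x) = \sum_y Q(g^{-1}x, y) = 1.$$

(a) If we can find $n_0$ such that
$\inf\{\widetilde{K}^{n_0}(x,y) : x,y \in A^*\} \ge \delta > 0$, then since
$\widetilde{\pi} = \widetilde{\pi}\widetilde{K}^{n_0}$, one obtains
$\widetilde{\pi}(x_+) = \max\{\widetilde{\pi}\} \le \delta^{-1}\min\{\widetilde{\pi}\} = \delta^{-1}\widetilde{\pi}(x_-)$.
Unfortunately, the nature of the kernel $\widetilde{K}$ makes it difficult
to find a suitable $n_0$.

(b) A variation on this idea is as follows. Assume that, for any
$(x,y) \in A_+^* \times A_-^*$, we can find an element $b = b(x,y)$ such
that

$$\frac{\widetilde{K}(b,x)}{1 - \sum_{z \ne b}\widetilde{K}(z,x)} \in (0,\infty) \quad \text{and} \quad \frac{1 - \sum_{z \ne b}\widetilde{K}(z,y)}{\widetilde{K}(b,y)} \in (0,\infty).$$

Then for $(x,y) \in A_+^* \times A_-^*$ such that $\widetilde{\pi}(x) = M$
and $\widetilde{\pi}(y) = m$ as defined in Lemma 4.5 we have

$$\widetilde{\pi}(x) \le \left(\frac{\widetilde{K}(b,x)\big(1 - \sum_{z \ne b}\widetilde{K}(z,y)\big)}{\widetilde{K}(b,y)\big(1 - \sum_{z \ne b}\widetilde{K}(z,x)\big)}\right)\widetilde{\pi}(y).$$

This gives $\max\{\widetilde{\pi}\} \le C\min\{\widetilde{\pi}\}$ with

$$C = \max_{(x,y) \in A_+^* \times A_-^*}\left\{\frac{\widetilde{K}(b,x)\big(1 - \sum_{z \ne b}\widetilde{K}(z,y)\big)}{\widetilde{K}(b,y)\big(1 - \sum_{z \ne b}\widetilde{K}(z,x)\big)}\right\}.$$

Note that $C$ depends on the choice of the $b(x,y)$ for each
$(x,y) \in A_+^* \times A_-^*$. Different choices of allowed $b$'s may yield
a different constant $C$. If the location of $\max\widetilde{\pi}$ and
$\min\widetilde{\pi}$ can be determined, then there is no need to calculate
$C$ over all of $A_+^* \times A_-^*$. Examples using this remark are in the
next two sections.

5. Cyclic edge perturbation on the circle. This section examines some
examples of a moving wave on the circle graph. On the circle graph on
$N = 2\ell + 1$ vertices and for $\varepsilon > 0$ fixed, let $K$ be the
reversible Markov kernel corresponding to putting weight 1 on all edges
except the $(0,1)$ edge which has weight $1 + \varepsilon$. Hence

(5.1) $$K(x,y) = \begin{cases}
0, & \text{if } |x-y| \ne 1,\\
1/2, & \text{if } |x-y| = 1 \text{ and } x \notin \{0,1\},\\
(1+\varepsilon)/(2+\varepsilon), & \text{if } (x,y) \in \{(0,1), (1,0)\},\\
1/(2+\varepsilon), & \text{if } (x,y) \in \{(0,-1), (1,2)\}.
\end{cases}$$

This has reversible measure

$$\pi(x) = \begin{cases}
1/(N+\varepsilon), & \text{if } x \ne 0,1,\\
(1+\varepsilon/2)/(N+\varepsilon), & \text{if } x = 0,1.
\end{cases}$$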

Note that this can be written as a perturbation (see Section 4) of the
symmetric kernel $Q$ of simple random walk, $Q(x,y) = 1/2$ if
$|x - y| = 1$ and $Q(x,y) = 0$ otherwise. The perturbation set $A$ is
$A = \{0,1\}$ and $\Delta_A = 0$ except for the following values:

$$\Delta_A(0,1) = \Delta_A(1,0) = \varepsilon/(4+2\varepsilon), \qquad \Delta_A(0,-1) = \Delta_A(1,2) = -\varepsilon/(4+2\varepsilon).$$
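
The perturbation structure of the kernel (5.1) can be checked mechanically.
The sketch below uses the illustrative values $N = 7$ and
$\varepsilon = 1/3$ (assumptions of the sketch, chosen so that
$\varepsilon \in (0,1)$ as in Section 4) and verifies conditions (a), (b),
(c), the stated values of the perturbation, and reversibility with respect
to $\pi$.

```python
from fractions import Fraction as F

N, eps = 7, F(1, 3)      # illustrative values: N odd, 0 < eps < 1
A = {0, 1}               # perturbation set

def Q(x, y):
    """Simple random walk on the circle Z_N."""
    return F(1, 2) if (x - y) % N in (1, N - 1) else F(0)

def K(x, y):
    """Kernel (5.1): edge (0, 1) has weight 1 + eps, all others weight 1."""
    if (x - y) % N not in (1, N - 1):
        return F(0)
    if x not in A:
        return F(1, 2)
    if {x, y} == A:
        return (1 + eps) / (2 + eps)
    return 1 / (2 + eps)

def pi(x):
    """Reversible measure of K."""
    return (1 + eps / 2) / (N + eps) if x in A else 1 / (N + eps)

S = range(N)
delta = [[K(x, y) - Q(x, y) for y in S] for x in S]

cond_a = all(sum(row) == 0 for row in delta)
cond_b = all(delta[x][y] >= -eps * Q(x, y) for x in S for y in S)
cond_c = all(delta[x][y] == 0 for x in S if x not in A for y in S)
reversible = all(pi(x) * K(x, y) == pi(y) * K(y, x) for x in S for y in S)
print(cond_a, cond_b, cond_c, reversible)
print(delta[0][1], delta[0][N - 1])   # eps/(4 + 2 eps) and its negative
```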

Because $N = 2\ell + 1$ is odd, the chain driven by $Q$ is ergodic with
relative-sup mixing time of order $N^2$. Its eigenvalues on $\ell^2(u)$ are

$$\cos\left(\frac{2\pi j}{N}\right), \qquad j = 0, 1, \dots, N-1,$$

and its singular values are their absolute values. In particular, the second
largest singular value is attained at $j = (N-1)/2$ and equals

(5.2) $$\beta_1 = \cos\frac{\pi}{N}.$$

Moreover, $Q$ satisfies the Nash inequality
(5.3) Vf:V^V \\ff 2 < 2 7 N 2 (£ Q * Q (f,f) + * + \\f\\^j \\f\\f. 

See, for example, Theorem 5.2 and Lemma 5.3 in [5].

We will investigate the general construction described earlier based on the
kernel $K$ above and various bijections including $x \mapsto x - 1$ and
$x \mapsto x + 2$. In these two cases, we prove a merging time estimate of
the type

$$T_\infty(\eta) \le C(\varepsilon)N^2(1 + \log_+ 1/\eta) \qquad \forall \eta > 0$$

for the associated periodic time inhomogeneous chain, but there are
interesting differences in the analysis of the two chains.

First, consider $g(x) = x - 1$. Then $K_i$ is the reversible kernel
corresponding to putting weight $1 + \varepsilon$ on the edge $(i-1, i)$
mod $N$. The graphs for $Q$ and $K_2$ are given in Figure 3.

Fig. 3. The cycling edge perturbation of $Q$. Left panel: $Q$, all edge
weights equal 1. Right panel: $K_2$, all edge weights equal 1 except the
edge $(1,2)$, which has weight $1 + \varepsilon$.

The kernel $\widetilde{K}(x,y) = K(x, g^{-1}y)$ is given by

$$\widetilde{K}(x,y) = \begin{cases}
0, & \text{if } y \notin \{x, x-2\},\\
1/2, & \text{if } y \in \{x, x-2\} \text{ and } x \notin \{0,1\},\\
(1+\varepsilon)/(2+\varepsilon), & \text{if } (x,y) \in \{(0,0), (1,-1)\},\\
1/(2+\varepsilon), & \text{if } (x,y) \in \{(0,-2), (1,1)\}.
\end{cases}$$

A simple calculation shows that $\widetilde{\pi}$ is constant away from
$0, 1$ and that

$$\widetilde{\pi}(x) = \begin{cases}
2(1+\varepsilon)/(\varepsilon^2 + 2N\varepsilon + 2N), & \text{if } x \ne 0,1,\\
(\varepsilon+1)(\varepsilon+2)/(\varepsilon^2 + 2N\varepsilon + 2N), & \text{if } x = 0,\\
(\varepsilon+2)/(\varepsilon^2 + 2N\varepsilon + 2N), & \text{if } x = 1.
\end{cases}$$

This proves c-stability of the sequence $(K_i)_1^\infty$ with respect to
$\mu_0 = \widetilde{\pi}$ with $c = 1 + \varepsilon$. This distribution
yields the wave $\mu_i(x) = \widetilde{\pi}(g^i x)$ created by the time
inhomogeneous Markov chain driven by $(K_i)_1^\infty$.
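
The "simple calculation" for $g(x) = x - 1$ can be confirmed with exact
arithmetic. In the sketch below, $N = 7$ and $\varepsilon = 1/3$ are
illustrative assumptions; the code checks that the stated measure is
invariant for the twisted kernel, that the ratio of its maximum to its
minimum is $1 + \varepsilon$, and that the wave relation
$\mu_{i-1}K_i = \mu_i$ of Proposition 3.3 holds.

```python
from fractions import Fraction as F

N, eps = 7, F(1, 3)      # illustrative values, N odd

def K(x, y):
    """Kernel (5.1) on Z_N; arguments are reduced mod N."""
    x, y = x % N, y % N
    if (x - y) % N not in (1, N - 1):
        return F(0)
    if x not in (0, 1):
        return F(1, 2)
    if {x, y} == {0, 1}:
        return (1 + eps) / (2 + eps)
    return 1 / (2 + eps)

def K_tilde(x, y):
    """K_tilde(x, y) = K(x, g^{-1} y) with g(x) = x - 1, so g^{-1}(y) = y + 1."""
    return K(x, y + 1)

def pi_tilde(x):
    """The stated invariant measure of K_tilde."""
    x %= N
    d = eps ** 2 + 2 * N * eps + 2 * N
    if x == 0:
        return (eps + 1) * (eps + 2) / d
    if x == 1:
        return (eps + 2) / d
    return 2 * (1 + eps) / d

def K_i(i, x, y):
    """K_i(x, y) = K(g^{i-1} x, g^{i-1} y)."""
    return K(x - (i - 1), y - (i - 1))

def mu(i, x):
    """The wave mu_i(x) = pi_tilde(g^i x)."""
    return pi_tilde(x - i)

S = range(N)
invariant = all(sum(pi_tilde(x) * K_tilde(x, y) for x in S) == pi_tilde(y)
                for y in S)
values = [pi_tilde(x) for x in S]
wave = all(sum(mu(i - 1, z) * K_i(i, z, x) for z in S) == mu(i, x)
           for i in range(1, N + 1) for x in S)
print(invariant, sum(values), max(values) / min(values), wave)
```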

Using Proposition 4.2 and (5.2), this proves that the relative-sup merging
time for the sequence $(K_i)_1^\infty$ is bounded by
$T_\infty(\eta) \le C(\varepsilon)N^2(\log N + \log_+ 1/\eta)$. An improved
result showing relative-sup merging in time of order $N^2$ is obtained using
Proposition 4.4 and the Nash inequality (5.3) of the circle graph.

Let us now consider what happens if we choose $g(x) = x + 2$. In terms of
the sequence $K_i$, this means that $K_i$ now has the same perturbation as
$K$ but at the edge $(-2(i-1), -2(i-1)+1)$ mod $N$. The kernel
$\widetilde{K}$ is given by

$$\widetilde{K}(x,y) = \begin{cases}
0, & \text{if } y - x \notin \{1,3\},\\
1/2, & \text{if } y - x \in \{1,3\} \text{ and } x \notin \{0,1\},\\
(1+\varepsilon)/(2+\varepsilon), & \text{if } (x,y) \in \{(0,3), (1,2)\},\\
1/(2+\varepsilon), & \text{if } (x,y) \in \{(0,1), (1,4)\}.
\end{cases}$$

Contrary to what happens with $g : x \mapsto x - 1$, in the present case, there
is no simple formula for $\tilde\pi$ (in particular, $\tilde\pi$ is not constant away from the
perturbation). Figure 4 presents a simulation of the stationary measure $\tilde\pi$
for $N = 41$ and $\epsilon = 1$.

[Figure 4: plot of $\tilde\pi(x)$ for $x$ from 0 to 40; vertical axis ticks at 0.005, 0.01, 0.015.]

Fig. 4. $\tilde\pi$ for $N = 41$ and $\epsilon = 1$.

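However, it is easy to see from the linear equations defining $\tilde\pi$ (i.e., from
Lemma 4.5) that $\max\{\tilde\pi\}$ must be attained at either 2 or 3, and $\min\{\tilde\pi\}$
must be attained at either 1 or 4. Suppose they are attained at 2 and 1. As

$$\tilde\pi(2) = \frac{1+\epsilon}{2+\epsilon}\,\tilde\pi(1) + \frac{1}{2}\,\tilde\pi(-1)$$

and $\tilde\pi(-1) \le \tilde\pi(2)$, we must have

$$\tilde\pi(2) \le \frac{1+\epsilon}{1+\epsilon/2}\,\tilde\pi(1).$$

Suppose instead the max and min are attained at 2 and 4. Then, the same
equation again gives $\tilde\pi(2) \le \frac{1+\epsilon}{1+\epsilon/2}\,\tilde\pi(1)$, while the equation

$$\tilde\pi(4) = \frac{1}{2}\,\tilde\pi(3) + \frac{1}{2+\epsilon}\,\tilde\pi(1)$$

together with $\tilde\pi(3) \ge \tilde\pi(4)$ gives $\tilde\pi(1) \le (1 + \epsilon/2)\,\tilde\pi(4)$, that is,

$$\tilde\pi(4) \ge \frac{1}{1+\epsilon}\,\tilde\pi(2).$$

The case where the max and min are attained at 3 and 4 is treated similarly.
The remaining case where the max and min are attained at 3 and 1 is slightly
different because there is no direct relation between $\tilde\pi(3)$ and $\tilde\pi(1)$. However,
the same line of reasoning yields

$$\tilde\pi(3) \le \frac{1+\epsilon}{1+\epsilon/2}\,\tilde\pi(0) \quad\text{and}\quad \tilde\pi(0) \le (1 + \epsilon/2)\,\tilde\pi(1).$$

This shows that

(5.4) $\max\{\tilde\pi\} \le (1 + \epsilon) \min\{\tilde\pi\}$.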
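Since there is no closed form in this case, (5.4) and the location of the extrema can at least be checked numerically for the parameters of Figure 4. The following is a sketch under assumptions: $K$ is taken to be the simple random walk on $\mathbb{Z}/N\mathbb{Z}$ with the edge $(0,1)$ reweighted (our reading of (5.1), inferred from the formula for $\tilde K$), and the helpers `perturbed_cycle_kernel`, `transported_kernel` and `stationary` are hypothetical names. The stationary measure is obtained by exact Gauss-Jordan elimination rather than by simulation.

```python
from fractions import Fraction

def perturbed_cycle_kernel(N, eps):
    """Assumed form of K: cycle walk with weight 1 + eps on the edge (0, 1)."""
    K = [[Fraction(0)] * N for _ in range(N)]
    for x in range(N):
        K[x][(x - 1) % N] = K[x][(x + 1) % N] = Fraction(1, 2)
    K[0][1] = K[1][0] = (1 + eps) / (2 + eps)
    K[0][N - 1] = K[1][2] = 1 / (2 + eps)
    return K

def transported_kernel(K, g):
    """K~(x, y) = K(x, g^{-1} y), written as K~(x, g(z)) = K(x, z)."""
    N = len(K)
    Kt = [[Fraction(0)] * N for _ in range(N)]
    for x in range(N):
        for z in range(N):
            Kt[x][g[z]] = K[x][z]
    return Kt

def stationary(P):
    """Solve pi P = pi with sum(pi) = 1 exactly (P irreducible)."""
    n = len(P)
    # Row y encodes sum_x pi(x) P(x, y) - pi(y) = 0; the last row is
    # replaced by the normalization sum(pi) = 1.
    A = [[P[x][y] - (1 if x == y else 0) for x in range(n)] + [Fraction(0)]
         for y in range(n)]
    A[n - 1] = [Fraction(1)] * (n + 1)
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[r][n] for r in range(n)]

N, eps = 41, Fraction(1)
g = [(x + 2) % N for x in range(N)]
pi = stationary(transported_kernel(perturbed_cycle_kernel(N, eps), g))

argmax = max(range(N), key=lambda x: pi[x])
argmin = min(range(N), key=lambda x: pi[x])
print(argmax, argmin, float(max(pi) / min(pi)))
```

The same three helpers work for any permutation `g` given as a list, so replacing the line defining `g` is enough to experiment with other bijections, as long as the resulting $\tilde K$ is irreducible.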

Because of (5.4) and Corollary 3.4, the sequence $(K_i)_1^\infty$ is $(1 + \epsilon)$-stable with
respect to $\tilde\pi$. Applying Proposition 4.4 and (5.3) yields again a relative-sup
merging time of order $N^2$ for the sequence $(K_i)_1^\infty$. The following theorem records
this result in more general form.

Theorem 5.1. Let $V_N = \{0, \dots, N\}$. Fix $\epsilon > 0$ and let $K$ be as in (5.1).
Fix a permutation $g = g_N$ of $V_N$ and let $K_i, \tilde K, \tilde\pi, \mu_i$ be associated to $K, g$
as in Section 3. Assume that there exists $c > 1$ such that

(5.5) $\max_{x \in V_N} \{\tilde\pi(x)\} \le c \min_{x \in V_N} \{\tilde\pi(x)\}$.

Then there is a constant $C(\epsilon, c)$ such that the relative-sup merging time for
$(K_i)_1^\infty$ is bounded by

$$T_\infty(\eta) \le C(\epsilon, c) N^2 (1 + \log_+ 1/\eta).$$

Remark 5.2. For which permutations $g$ of the set $V_N = \{0, \dots, N\}$ does
the conclusion of the theorem above hold? According to the theorem, it suffices
to check that condition (5.5) is satisfied. For instance, (5.5) is satisfied
if $g(x) = x - 1$ or $g(x) = x + 2$ [in fact, by symmetry, for $g(x) = x \pm 1$,
$g(x) = x \pm 2$]. It is very plausible that (5.5) is always satisfied, whatever the
permutation $g$ is. However, this does not follow directly from an argument
similar to the one used for $g(x) = x - 1$ and $g(x) = x + 2$. In fact, the argument
already fails miserably for $g(x) = x + 3$. The reader may want to
convince herself of that. In general, we want to compare the min and max
of $\tilde\pi$. It is easy to see that the max is attained at either $g(0)$ or $g(1)$ and the
min at either $g(-1)$ or $g(2)$. The case where the max and min are attained
at either $(g(0), g(2))$ or $(g(1), g(-1))$ can be treated as above because the
values of $\tilde\pi$ at $g(0), g(2)$ [resp., at $g(-1), g(1)$] are both related to the value
at 1 (resp., 0). But, in the other cases, it becomes much more tricky to
compare the max and min without further hypotheses.

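Let $P$ be the lazy version of the kernel defined in (5.1) with

(5.6)
$$P(x,y) = \begin{cases}
1/2, & \text{if } x = y,\\
1/4, & \text{if } |x - y| = 1 \text{ and } x \notin \{0, 1\},\\
\dfrac{1+\epsilon}{2(2+\epsilon)}, & \text{if } (x,y) \in \{(0,1), (1,0)\},\\
\dfrac{1}{2(2+\epsilon)}, & \text{if } (x,y) \in \{(0,-1), (1,2)\},\\
0, & \text{otherwise.}
\end{cases}$$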






Let $g$ be any permutation of the set $V_N = \{0, \dots, N\}$, and define $P_i(x, y) =
P(g^{i-1} x, g^{i-1} y)$ for all $i = 1, 2, \dots$ and $\tilde P(x, y) = P(x, g^{-1} y)$. In this case, we
can show that condition (4.2) holds, which implies a relative-sup merging
time of order $N^2$ for any permutation $g$.

Theorem 5.3. Let $V_N = \{0, \dots, N\}$. Fix $\epsilon > 0$ and let $P$ be as in (5.6).
Fix a permutation $g = g_N$ of $V_N$ and let $P_i, \tilde P, \tilde\pi, \mu_i$ be associated to $P, g$ as
in Section 3 (replacing $K$ by $P$). Then

(5.7) $\max_{x \in V_N} \{\tilde\pi(x)\} \le (1 + \epsilon) \min_{x \in V_N} \{\tilde\pi(x)\}$.

Furthermore, there is a constant $C(\epsilon)$ such that the relative-sup merging
time for $(P_i)_1^\infty$ is bounded by

$$T_\infty(\eta) \le C(\epsilon) N^2 (1 + \log_+ 1/\eta).$$



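Proof. By Proposition 4.4 and (5.2)-(5.3), it suffices to prove (5.7).
Fix a permutation $g = g_N$ of $V_N = \{0, \dots, N\}$. The kernel $\tilde P$ is given by

$$\tilde P(x,y) = \begin{cases}
1/2, & \text{if } x = g^{-1} y,\\
1/4, & \text{if } |x - g^{-1} y| = 1 \text{ and } x \notin \{0, 1\},\\
\dfrac{1+\epsilon}{2(2+\epsilon)}, & \text{if } (x, g^{-1} y) \in \{(0,1), (1,0)\},\\
\dfrac{1}{2(2+\epsilon)}, & \text{if } (x, g^{-1} y) \in \{(0,-1), (1,2)\},\\
0, & \text{otherwise.}
\end{cases}$$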



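By Lemma 4.5, the maximum value of $\tilde\pi$ is attained at either $g(0)$ or $g(1)$
and the minimum at $g(-1)$ or $g(2)$. Moreover,

$$\begin{aligned}
\tilde\pi(g(-1)) &= \frac{\tilde\pi(-1)}{2} + \frac{\tilde\pi(-2)}{4} + \frac{\tilde\pi(0)}{2(2+\epsilon)},\\
\tilde\pi(g(2)) &= \frac{\tilde\pi(2)}{2} + \frac{\tilde\pi(3)}{4} + \frac{\tilde\pi(1)}{2(2+\epsilon)},\\
\tilde\pi(g(0)) &= \frac{\tilde\pi(0)}{2} + \frac{\tilde\pi(-1)}{4} + \frac{(1+\epsilon)\tilde\pi(1)}{2(2+\epsilon)},\\
\tilde\pi(g(1)) &= \frac{\tilde\pi(1)}{2} + \frac{\tilde\pi(2)}{4} + \frac{(1+\epsilon)\tilde\pi(0)}{2(2+\epsilon)}.
\end{aligned}$$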



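Note that for any of the four possible max/min pairs, the max and min
values can both be compared via the equations above to either $\tilde\pi(0)$ or $\tilde\pi(1)$.
See Remark 4.6(b). For instance, suppose the max/min pair is $(g(0), g(-1))$.
Bounding the remaining terms in the equation for $\tilde\pi(g(0))$ by the max, and
those in the equation for $\tilde\pi(g(-1))$ by the min, we get

$$\tilde\pi(g(0)) \le \frac{2(2+\epsilon)}{4+\epsilon}\,\tilde\pi(0) \quad\text{and}\quad \tilde\pi(0) \le \frac{2+\epsilon}{2}\,\tilde\pi(g(-1)).$$

Hence,

$$\tilde\pi(g(0)) \le \frac{(2+\epsilon)^2}{4+\epsilon}\,\tilde\pi(g(-1)) \le (1+\epsilon)\,\tilde\pi(g(-1)).$$

The other cases are similar, and it follows that $\max\{\tilde\pi\} \le (1 + \epsilon)\min\{\tilde\pi\}$.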

□ 
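For a small state space, inequality (5.7) can be confirmed by brute force over every permutation. The sketch below is an illustration only and makes two assumptions: the state space is identified with $\mathbb{Z}/6\mathbb{Z}$ (six points on a cycle), and the helper names `lazy_kernel`, `stationary` and `worst_ratio` are hypothetical. All arithmetic is exact, so `eps` must be passed as a `Fraction`.

```python
from fractions import Fraction
from itertools import permutations

def lazy_kernel(N, eps):
    """The lazy kernel P of (5.6) on Z/NZ (eps must be a Fraction)."""
    P = [[Fraction(0)] * N for _ in range(N)]
    for x in range(N):
        P[x][x] = Fraction(1, 2)
        P[x][(x - 1) % N] = P[x][(x + 1) % N] = Fraction(1, 4)
    P[0][1] = P[1][0] = (1 + eps) / (2 * (2 + eps))
    P[0][N - 1] = P[1][2] = 1 / (2 * (2 + eps))
    return P

def stationary(P):
    """Solve pi P = pi with sum(pi) = 1 exactly (P irreducible)."""
    n = len(P)
    A = [[P[x][y] - (1 if x == y else 0) for x in range(n)] + [Fraction(0)]
         for y in range(n)]
    A[n - 1] = [Fraction(1)] * (n + 1)        # normalization row
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[r][n] for r in range(n)]

def worst_ratio(N, eps):
    """Largest value of max(pi~) / min(pi~) over all permutations g."""
    P = lazy_kernel(N, eps)
    worst = Fraction(0)
    for g in permutations(range(N)):
        Pt = [[Fraction(0)] * N for _ in range(N)]
        for x in range(N):
            for z in range(N):
                Pt[x][g[z]] = P[x][z]         # P~(x, g(z)) = P(x, z)
        pi = stationary(Pt)
        worst = max(worst, max(pi) / min(pi))
    return worst

eps = Fraction(1)
worst = worst_ratio(6, eps)
print(worst, 1 + eps)
```

Laziness matters for the enumeration itself: because $\tilde P(x, g(x)) = 1/2$ and $\tilde P(x, g(x \pm 1)) > 0$, the kernel $\tilde P$ is irreducible for every permutation $g$, so the linear solve in `stationary` always has a unique solution.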

6. Further examples: Single point perturbations. In the next two examples,
we consider perturbations of a symmetric kernel as described in Section
4 but with $A = \{o\}$ for some $o \in V$, that is, the perturbation occurs at a single
point. In the second example, we make an additional assumption on the
structure of the perturbation. In these cases, we are able to obtain easily
applicable bounds.

Example 6.1. Let $Q$ be a symmetric kernel as in Section 4. Fix
$\epsilon \in (0, 1)$, and let $K = Q + \Delta_o$ where $\Delta_o = \Delta_{\{o\}}$ satisfies

$$-\epsilon Q(o,y) \le \Delta_o(o,y), \qquad \sum_y \Delta_o(o,y) = 0 \qquad\text{and}\qquad \Delta_o(x,y) = 0 \text{ if } x \ne o.$$

Note that $K(x,y) \ge (1 - \epsilon)Q(x,y)$, and $K$ satisfies the properties (a)-(c)
listed at the beginning of Section 4. Fix a permutation $g$ of $V$ and assume
that $\tilde K$ is irreducible. Then Lemma 4.5 says that the min and max of $\tilde\pi$ are
attained respectively on $A^*_+$, $A^*_-$ and Remark 4.6(b) gives

(6.1) $\max\{\tilde\pi\} \le C \min\{\tilde\pi\}$,

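where

$$\begin{aligned}
C &= \max_{(x,y) \in A^*_+ \times A^*_-} \left\{ \frac{\tilde K(o,x)\bigl(1 - \sum_{z \ne o} \tilde K(z,y)\bigr)}{\tilde K(o,y)\bigl(1 - \sum_{z \ne o} \tilde K(z,x)\bigr)} \right\}\\
&\le \max_{x \in A^*_+} \left\{ \frac{K(o, g^{-1} x)}{(1 - \epsilon) Q(o, g^{-1} x)} \right\}\\
&\le \frac{1}{(1 - \epsilon)\theta}, \qquad \theta^{-1} = \max_{x \in A^*_+} \left\{ \frac{K(o, g^{-1} x)}{Q(o, g^{-1} x)} \right\}.
\end{aligned}$$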

Equation (6.1) and Proposition 4.2 now imply that the relative-sup
$\eta$-merging time of the sequence $(K_i)_1^\infty$ is at most

(6.2) $\dfrac{D}{1 - \sigma_1} \bigl(\log |V| + \log_+ 1/\eta\bigr)$,

where $\sigma_1$ is the second largest singular value of the kernel $Q$ on $\ell^2(u)$, and
$D = D(\epsilon, \theta)$ is a constant that depends only on $\epsilon \in (0, 1)$ and $\theta$ (the constant
$D$ can easily be made explicit).
Example 6.2 (Perturbation of expander graphs). Fix an integer $r$ and
consider a sequence $\mathcal{G}_N = (V_N, E_N)$ of regular graphs with vertex set $V_N$ of
size $|V_N|$ tending to infinity and symmetric edge set $E_N \subset V_N \times V_N$ with
$(x,x) \in E_N$ for all $x \in V_N$. On each graph, consider the symmetric Markov
kernel $Q = Q_N$ corresponding to the simple random walk on $\mathcal{G}_N$. Hence,
$Q_N(x,y) = 1/r$ if $(x,y) \in E_N$ and $Q_N(x,y) = 0$ otherwise. Let $\sigma_1(N)$ be
the second largest singular value of $Q_N$ on $\ell^2(u_N)$ where $u_N$ is the uniform
probability measure on $V_N$. Assume that there is a constant $a \in (0, 1)$ such
that

(6.3) $\forall N, \quad 1 - \sigma_1(N) \ge a$.

This property is a strong form of the property that defines the so-called
expander graphs (see, e.g., [11, 12] and the references therein).

Fix an origin $o = o_N$ in $V_N$ and consider a perturbation $K_N$ of $Q_N$ as
in Example 6.1. Fix also a bijection $g_N : V_N \to V_N$. For each $N$, consider
the time inhomogeneous chain on $V_N$ driven by $(K_{N,i})_1^\infty$ where $K_{N,i}(x,y) =
K_N(g_N^{i-1} x, g_N^{i-1} y)$. In this situation, (6.2) yields merging for the sequence
$(K_{N,i})_1^\infty$ in order $\log |V_N|$ steps, uniformly in $N$. Note that this result requires
the degree $r$ of the graph to be fixed (or, at least, bounded from
above, uniformly in $N$).



Example 6.3. Here we strengthen the hypotheses and the conclusion
of Example 6.1. Namely, we assume that there exists $\delta \in (0, 1 -
Q(o,o))$ such that

(6.4) $0 \le \Delta_o(o,o) \le \delta, \qquad -\delta\, \dfrac{Q(o,y)}{1 - Q(o,o)} \le \Delta_o(o,y) \le 0 \quad \forall y \ne o,$

and

$$\Delta_o(x, y) = 0 \quad\text{if } x \ne o.$$

Set

(6.5) $\epsilon = \dfrac{\delta}{1 - Q(o,o)}$.

A careful analysis of this example yields a much improved estimate for in- 
stability and the relative sup merging time when compared to the previous 
example. The difference lies in the fact that the perturbation is positive only 
at o. 

Lemma 6.1. Assume that $\tilde K$ is irreducible. Let $m = \min_x \{\tilde\pi(x)\}$ and
$M = \max_x \{\tilde\pi(x)\}$. We have that $\tilde\pi(o) = M$ and, for $\epsilon$ as in (6.5),

$$m \ge (1 - \epsilon)\tilde\pi(o).$$

Proof. Lemma 4.5 tells us that $M = \tilde\pi(o)$ and that $m =
\tilde\pi(x_-)$ for some $x_-$ with $\tilde K(o, x_-) > 0$. Further,

$$\begin{aligned}
\tilde\pi(x_-) &= \sum_x \tilde\pi(x) \tilde K(x, x_-)\\
&\ge \tilde\pi(o) \tilde K(o, x_-) + \tilde\pi(x_-) \sum_{x \ne o} Q(x, g^{-1} x_-)\\
&\ge (1 - \epsilon)\tilde\pi(o) Q(o, g^{-1} x_-) + \tilde\pi(x_-)\bigl(1 - Q(o, g^{-1} x_-)\bigr).
\end{aligned}$$

So we get $\tilde\pi(x_-) \ge (1 - \epsilon)\tilde\pi(o)$ as desired. □

Example 6.4. Let $\mathcal{G}_N = (V_N, E_N)$ be a sequence of regular expander
graphs as in Example 6.2 but with degree $r_N \ge 3$ that might depend on $N$.
Fix $\delta \in (0, 2/3)$ and bijections $g_N : V_N \to V_N$. Consider a perturbation $K_N$
of the simple random walk $Q_N$ on $\mathcal{G}_N$ as in Example 6.3. The constant $\epsilon$ at
(6.5) is $\epsilon_N = \delta(r_N/(r_N - 1)) \le 3\delta/2$ and the measure $\tilde\pi_N$ satisfies

$$\max_{V_N} \{\tilde\pi_N\} \le (1 - 3\delta/2)^{-1} \min_{V_N} \{\tilde\pi_N\}.$$

It follows from this and Proposition 4.2 that the associated sequence of
perturbed kernels $(K_{N,i})_1^\infty$ merges in order $\log |V_N|$ steps.
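As a worked check of the constants in this example, the display below traces (6.5) for the simple random walk. It relies on the convention of Example 6.2 that every vertex carries a loop, so that $Q_N(o,o) = 1/r_N$, and the second line is Lemma 6.1 applied with $\epsilon = \epsilon_N$.

```latex
\begin{align*}
\epsilon_N &= \frac{\delta}{1 - Q_N(o,o)} = \frac{\delta}{1 - 1/r_N}
            = \delta\,\frac{r_N}{r_N - 1} \le \frac{3\delta}{2} < 1
            \qquad (r_N \ge 3,\ \delta < 2/3),\\
\max_{V_N}\{\tilde\pi_N\} &\le (1 - \epsilon_N)^{-1} \min_{V_N}\{\tilde\pi_N\}
            \le (1 - 3\delta/2)^{-1} \min_{V_N}\{\tilde\pi_N\}.
\end{align*}
```

The restriction $\delta < 2/3$ is exactly what keeps $1 - 3\delta/2$ positive, so the stability constant is finite and independent of $N$ even when the degree $r_N$ grows.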

Example 6.5 (Sticky permutation). The following is a particular case of
Example 6.3. It is treated in more detail in [18]. On $V = S_n$, the symmetric
group, let

$$Q(x,y) = \begin{cases}
1/(2n), & \text{if } y = x(1,j),\ j \in \{2, \dots, n\},\\
(n+1)/(2n), & \text{if } x = y,\\
0, & \text{otherwise.}
\end{cases}$$

This is the kernel of the lazy version of the random walk called "transpose
top and random." Fix a permutation $\rho_n \in S_n$, $\delta \in (0, (n - 1)/(2n))$ and let

$$K(x,y) = \begin{cases}
Q(x,y), & \text{if } x \ne \rho_n,\\
Q(x,y) + \delta, & \text{if } x = y = \rho_n,\\
Q(x,y) - \delta/(n-1), & \text{if } x = \rho_n \text{ and } y = x(1,j) \text{ for } j \in \{2, \dots, n\}.
\end{cases}$$

In words, $K$ is obtained from $Q$ by adding extra holding probability at $\rho_n$,
making $\rho_n$ "sticky." Next, if $\sigma$ is the cycle $(1, \dots, n)$, let

$$K_i(x, y) = K(\sigma^{i-1} x \sigma^{-i+1}, \sigma^{i-1} y \sigma^{-i+1}).$$

Hence $K_i$ is $Q_i$ with some added holding at $\rho_i = \sigma^{-i+1} \rho \sigma^{i-1}$. This is obviously
a special case of Example 6.3, and we thus have

(6.6) $\max\{\tilde\pi\} \le c \min\{\tilde\pi\}, \qquad c = (1 - 2n\delta/(n - 1))^{-1}$.
Hence Proposition 4.2 applies. The second largest singular value of $Q$ is
known to be $\sigma_1 = 1 - 1/(2n)$ (see, e.g., [2, 7, 16]). This yields an upper
bound of order $n(n \log n + \log_+ 1/\eta)$ for the relative-sup merging time $T_\infty(\eta)$
of the sequence $(K_i)_1^\infty$. This result can be improved by using the logarithmic
Sobolev inequality technique of [18], (6.6) and Lemma 4.3. The logarithmic
Sobolev constant $l(Q^2)$ of $Q^2$ is of order $1/(n \log n)$ (see [6]). This yields a
relative-sup merging time upper bound of order $n((\log n)^2 + \log_+ 1/\eta)$. This
result holds also if we replace the lazy random walk $Q$ above by its nonlazy
version, the usual "transpose top with random."
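The value $\sigma_1 = 1 - 1/(2n)$ quoted above can be observed numerically for small $n$. The following sketch is only an illustration: it builds $Q$ on $S_4$ and runs power iteration on the orthogonal complement of the constants. It uses the fact that $Q$ is symmetric and, being lazy, has nonnegative eigenvalues, so its singular values coincide with its eigenvalues. The function names are hypothetical.

```python
import random
from itertools import permutations

def lazy_transpose_top_and_random(n):
    """Q on S_n as sparse rows: a list of (state index, weight) per state."""
    states = list(permutations(range(n)))
    index = {s: k for k, s in enumerate(states)}
    rows = []
    for s in states:
        row = [(index[s], (n + 1) / (2 * n))]           # holding probability
        for j in range(1, n):                           # move to x(1, j + 1)
            t = list(s)
            t[0], t[j] = t[j], t[0]
            row.append((index[tuple(t)], 1 / (2 * n)))
        rows.append(row)
    return rows

def second_singular_value(rows, iters=400, seed=0):
    """Power iteration for Q restricted to mean-zero vectors."""
    rng = random.Random(seed)
    m = len(rows)
    v = [rng.gauss(0, 1) for _ in range(m)]
    lam = 0.0
    for _ in range(iters):
        mean = sum(v) / m
        v = [a - mean for a in v]                       # project off constants
        norm = sum(a * a for a in v) ** 0.5
        v = [a / norm for a in v]
        w = [sum(wt * v[k] for k, wt in row) for row in rows]   # w = Q v
        lam = sum(a * b for a, b in zip(v, w))          # Rayleigh quotient
        v = w
    return lam

n = 4
sigma1 = second_singular_value(lazy_transpose_top_and_random(n))
print(sigma1, 1 - 1 / (2 * n))
```

Power iteration is chosen here only to keep the example free of external libraries; with a linear algebra package one would compute the full spectrum of the $n! \times n!$ matrix directly.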

A total variation merging time estimate of order $n(\log n + \log_+ 1/\eta)$ is
obtained in [18] by using Lemma 6.1 together with the modified logarithmic
Sobolev inequality technique. The crucial point is that the modified logarithmic
Sobolev constant $l'(Q^2)$ of $Q^2$ is of order $1/n$ (see [9, 18]). We do
not know how to prove this improved estimate for the nonlazy version of
this example.